554 research outputs found

    CrowdTruth 2.0: Quality Metrics for Crowdsourcing with Disagreement

    Typically, crowdsourcing-based approaches to gathering annotated data use inter-annotator agreement as a measure of quality. However, in many domains, there is ambiguity in the data, as well as a multitude of perspectives of the information examples. In this paper, we present ongoing work into the CrowdTruth metrics, which capture and interpret inter-annotator disagreement in crowdsourcing. The CrowdTruth metrics model the inter-dependency between the three main components of a crowdsourcing system: worker, input data, and annotation. The goal of the metrics is to capture the degree of ambiguity in each of these three components. The metrics are available online at https://github.com/CrowdTruth/CrowdTruth-core

    Empirical Methodology for Crowdsourcing Ground Truth

    The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods for populating the Semantic Web. Crowdsourcing-based approaches are gaining popularity in the attempt to solve the issues related to volume of data and lack of annotators. Typically, these practices use inter-annotator agreement as a measure of quality. However, in many domains, such as event detection, there is ambiguity in the data, as well as a multitude of perspectives of the information examples. We present an empirically derived methodology for efficiently gathering ground truth data in a diverse set of use cases covering a variety of domains and annotation tasks. Central to our approach is the use of CrowdTruth metrics that capture inter-annotator disagreement. We show that measuring disagreement is essential for acquiring a high-quality ground truth. We achieve this by comparing the quality of the data aggregated with CrowdTruth metrics against data aggregated with majority vote, over a set of diverse crowdsourcing tasks: Medical Relation Extraction, Twitter Event Identification, News Event Extraction and Sound Interpretation. We also show that an increased number of crowd workers leads to growth and stabilization in the quality of annotations, going against the usual practice of employing a small number of annotators. Comment: in publication at the Semantic Web Journal
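
The majority-vote baseline mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the CrowdTruth implementation: the function name `majority_vote`, the labels, and both example units are hypothetical, and the support fraction it returns is a plain vote share, not one of the CrowdTruth metrics.

```python
from collections import Counter


def majority_vote(annotations):
    """Return the most frequent label and the fraction of workers who chose it."""
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(annotations)


# Hypothetical worker labels for two made-up input units.
clear_unit = ["event", "event", "event", "event", "no_event"]
ambiguous_unit = ["event", "no_event", "event", "no_event", "event"]

clear_label, clear_support = majority_vote(clear_unit)
ambiguous_label, ambiguous_support = majority_vote(ambiguous_unit)

# Both units receive the same aggregated label, but the vote share differs.
print(clear_label, clear_support)
print(ambiguous_label, ambiguous_support)
```

Both units end up with the same label under majority vote, even though the workers were far more divided on the second one. That discarded disagreement is the kind of signal the abstract argues should be measured.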

    Social Agency as a Continuum

    This work was supported by the UKRI Biotechnology and Biological Sciences Research Council (BBSRC) grant number BB/M010996/1 to Crystal Silver (“Mechanisms of Social Agency”), and by a Carnegie Trust Research Incentive Grant to Bert Timmermans and Ramakrishna Chakravarthi ("Experiencing myself through you: Self-agency in social interaction" - RIG008270). Peer reviewed.

    Viewpoint Diversity in Search Results

    Adverse phenomena such as the search engine manipulation effect (SEME), where web search users change their attitude on a topic following whatever most highly-ranked search results promote, represent crucial challenges for research and industry. However, the current lack of automatic methods to comprehensively measure or increase viewpoint diversity in search results complicates the understanding and mitigation of such effects. This paper proposes a viewpoint bias metric that evaluates the divergence from a pre-defined scenario of ideal viewpoint diversity considering two essential viewpoint dimensions (i.e., stance and logic of evaluation). In a case study, we apply this metric to actual search results and find considerable viewpoint bias in search results across queries, topics, and search engines that could lead to adverse effects such as SEME. We subsequently demonstrate that viewpoint diversity in search results can be dramatically increased using existing diversification algorithms. The methods proposed in this paper can assist researchers and practitioners in evaluating and improving viewpoint diversity in search results.

    Reference genes for QRT-PCR tested under various stress conditions in Folsomia candida and Orchesella cincta (Insecta, Collembola)

    Background: Genomic studies measuring transcriptional responses to changing environments and stress are currently making their way into the fields of evolutionary ecology and ecotoxicology. To investigate a small to medium number of genes or to confirm large-scale microarray studies, Quantitative Reverse Transcriptase PCR (QRT-PCR) can achieve high accuracy of quantification when key standards, such as normalization, are carefully set. In this study, we validated potential reference genes for their use as endogenous controls under different chemical and physical stresses in two species of soil-living Collembola, Folsomia candida and Orchesella cincta. Treatments for F. candida were cadmium exposure, phenanthrene exposure, desiccation, heat shock and pH stress, and for O. cincta cadmium, desiccation, heat shock and starvation. Results: Eight potential reference genes for F. candida and seven for O. cincta were ranked by their stability per stress factor using the programs geNorm and Normfinder. For F. candida, the succinate dehydrogenase (SDHA) and eukaryotic transcription initiation factor 1A (ETIF) genes were found to be the most stable over the different treatments, while for O. cincta, the beta actin (ACTb) and tyrosine 3-monooxygenase (YWHAZ) genes were the most stable. Conclusion: We present a panel of reference genes for two emerging ecological genomic model species tested under a variety of treatments. Within each species, different treatments resulted in differences in the top stable reference genes. Moreover, the two species differed in suitable reference genes even when exposed to similar stresses. This might be attributed to dissimilarity of physiology. It is vital to rigorously test a panel of reference genes for each species and treatment, in advance of relative quantification of QRT-PCR gene expression measurements.

    Collembase: a repository for springtail genomics and soil quality assessment

    Background: Environmental quality assessment is traditionally based on responses of reproduction and survival of indicator organisms. For soil assessment, the springtail Folsomia candida (Collembola) is an accepted standard test organism. We argue that environmental quality assessment using gene expression profiles of indicator organisms exposed to test substrates is more sensitive, more toxicant specific and significantly faster than current risk assessment methods. To apply this species as a genomic model for soil quality testing, we conducted an EST sequencing project and developed an online database. Description: Collembase is a web-accessible database comprising springtail (F. candida) genomic data. Presently, the database contains information on 8686 ESTs that are assembled into 5952 unique gene objects. Of those gene objects, ~40% showed homology to other protein sequences available in GenBank (blastx analysis; non-redundant (nr) database; expect-value < 10^-5). Software was applied to infer protein sequences. The putative peptides, which had an average length of 115 amino acids (ranging between 23 and 440), were annotated with Gene Ontology (GO) terms. In total, 1025 peptides (~17% of the gene objects) were assigned at least one GO term (expect-value < 10^-25). Within Collembase, searches can be conducted based on BLAST and GO annotation, cluster name or using a BLAST server. The system furthermore enables easy sequence retrieval for functional genomic and Quantitative-PCR experiments. Sequences are submitted to GenBank (Accession numbers: EV473060 – EV481745). Conclusion: Collembase (http://www.collembase.org) is a resource of sequence data on the springtail F. candida. The information within the database will be linked to a custom-made microarray, based on the Agilent platform, which can be applied for soil quality testing. In addition, Collembase supplies information that is valuable for related scientific disciplines such as molecular ecology, ecogenomics, molecular evolution and phylogenetics.
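
The counts quoted in the Collembase abstract can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the homolog count is an estimate derived here from the rounded "~40%" figure, not a number reported in the abstract.

```python
# Figures quoted in the abstract.
EST_COUNT = 8686          # sequenced ESTs
GENE_OBJECTS = 5952       # unique gene objects after assembly
GO_ANNOTATED = 1025       # peptides with at least one GO term
HOMOLOGY_FRACTION = 0.40  # approximate share with GenBank homology ("~40%")

# Share of gene objects carrying a GO term (the abstract rounds this to ~17%).
go_fraction = GO_ANNOTATED / GENE_OBJECTS

# Average number of ESTs assembled into each gene object.
ests_per_gene_object = EST_COUNT / GENE_OBJECTS

# Rough homolog count implied by the rounded 40% figure (an estimate only).
approx_homologs = round(HOMOLOGY_FRACTION * GENE_OBJECTS)

print(f"GO-annotated share: {go_fraction:.1%}")
print(f"ESTs per gene object: {ests_per_gene_object:.2f}")
print(f"Approximate gene objects with homology: {approx_homologs}")
```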

    Engineering Orthogonal Polypeptide GalNAc-Transferase and UDP-Sugar Pairs

    O-Linked α-N-acetylgalactosamine (O-GalNAc) glycans constitute a major part of the human glycome. They are difficult to study because of the complex interplay of 20 distinct glycosyltransferase isoenzymes that initiate this form of glycosylation, the polypeptide N-acetylgalactosaminyltransferases (GalNAc-Ts). Despite proven disease relevance, correlating the activity of individual GalNAc-Ts with biological function remains challenging due to a lack of tools to probe their substrate specificity in a complex biological environment. Here, we develop a “bump–hole” chemical reporter system for studying GalNAc-T activity in vitro. Individual GalNAc-Ts were rationally engineered to contain an enlarged active site (hole) and probed with a newly synthesized collection of 20 (bumped) uridine diphosphate N-acetylgalactosamine (UDP-GalNAc) analogs to identify enzyme–substrate pairs that retain peptide specificities but are otherwise completely orthogonal to native enzyme–substrate pairs. The approach was applicable to multiple GalNAc-T isoenzymes, including GalNAc-T1 and -T2 that prefer nonglycosylated peptide substrates and GalNAc-T10 that prefers a preglycosylated peptide substrate. A detailed investigation of enzyme kinetics and specificities revealed the robustness of the approach to faithfully report on GalNAc-T activity and paves the way for studying substrate specificities in living systems.

    Socio-Economic and Governance Conditions Corresponding to Change in Animal Agriculture: South Dakota Case Study

    Understanding sustainable livestock production requires consideration of both qualitative and quantitative factors in a temporal and/or spatial frame. This study adapted Qualitative Comparative Analysis (QCA) to relate conditions of social, economic, and governance factors to changes in livestock inventory across several counties and over time. This paper presents an approach that (1) identified factors with the potential to relate to a change in livestock inventory and (2) analyzed commonalities within these factors related to changes spatially and temporally. This paper illustrates the approach and results when applied to five counties in eastern South Dakota. The specific response variables were periods of increasing, no change, or decreasing beef cattle, dairy cattle, and swine inventories in the specific counties for five-year census periods between 1992 and 2017. In the spatial analysis of counties, stable beef inventories and decreasing dairy inventories were related to counties with increasing gross domestic products. The presence of specific social communities was related to increases in county swine inventories. In the temporal analysis of census periods, local governance and economic factors, particularly market price influences, were more prevalent. Swine inventory showed a stronger link to cash crop markets than to livestock markets, whereas cattle market price increases were associated with stable inventories for all animal types. Local governance tools had mixed effects for the different animal types across space and time. The factors and analysis results are context-specific. However, the process considers the various socio-economic processes in livestock production and community development applicable to agricultural sustainability questions in the Midwest and beyond.

    HIV infection and stroke: current perspectives and future directions

    HIV infection can result in stroke via several mechanisms, including opportunistic infection, vasculopathy, cardioembolism, and coagulopathy. However, the occurrence of stroke and HIV infection might often be coincidental. HIV-associated vasculopathy describes various cerebrovascular changes, including stenosis and aneurysm formation, vasculitis, and accelerated atherosclerosis, and might be caused directly or indirectly by HIV infection, although the mechanisms are controversial. HIV and associated infections contribute to chronic inflammation. Combination antiretroviral therapies (cART) are clearly beneficial, but can be atherogenic and could increase stroke risk. cART can prolong life, increasing the size of the ageing population at risk of stroke. Stroke management and prevention should include identification and treatment of the specific cause of stroke and stroke risk factors, and judicious adjustment of the cART regimen. Epidemiological, clinical, biological, and autopsy studies of risk, the pathogenesis of HIV-associated vasculopathy (particularly of arterial endothelial damage), the long-term effects of cART, and ideal stroke treatment in patients with HIV are needed, as are antiretrovirals that are without vascular risk.

    Arctic Sea Ice Decline Significantly Contributed to the Unprecedented Liquid Freshwater Accumulation in the Beaufort Gyre of the Arctic Ocean

    The Beaufort Gyre (BG) is the largest liquid freshwater reservoir of the Arctic Ocean. The liquid freshwater content (FWC) significantly increased in the BG in the 2000s during an anticyclonic wind regime and remained at a high level despite a transition to a more cyclonic state in the early 2010s. It is not well understood to what extent the rapid sea ice decline during this period has modified the trend and variability of the BG liquid FWC in the past decade. Our numerical simulations show that about 50% of the liquid freshwater accumulated in the BG in the 2000s can be explained by the sea ice decline caused by the Arctic atmospheric warming. Of this part of the FWC increase, 60% can be attributed to surface freshening associated with the reduction of the net sea ice thermodynamic growth rate, and 40% to changes in ocean circulation, which make freshwater more accessible to the BG for storage. Thus, the rapid increase of the BG FWC in the 2000s was due to the concurrence of the anticyclonic wind regime and the high freshwater availability. We also find that if the Arctic sea ice had not declined, the liquid FWC in the BG would have shown a stronger decreasing tendency at the beginning of the 2010s owing to the cyclonic wind regime. From our results we argue that changes in sea ice conditions should be adequately taken into account when it comes to understanding and predicting variations of BG liquid FWC in a changing climate.
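
The nested percentages in the Beaufort Gyre abstract are easier to read once expressed as shares of the total 2000s accumulation. The sketch below only multiplies the rounded figures quoted above ("about 50%", 60%, 40%); the variable and function names are illustrative, and the results inherit the abstract's rounding.

```python
# Rounded shares quoted in the abstract.
SEA_ICE_SHARE = 0.50       # part of the 2000s FWC accumulation explained by sea ice decline
SURFACE_FRESHENING = 0.60  # part of that share due to reduced net thermodynamic ice growth
CIRCULATION_CHANGE = 0.40  # part of that share due to changes in ocean circulation


def share_of_total(parent_share, sub_share):
    """Convert a share of a share into a share of the total accumulation."""
    return parent_share * sub_share


surface_total = share_of_total(SEA_ICE_SHARE, SURFACE_FRESHENING)
circulation_total = share_of_total(SEA_ICE_SHARE, CIRCULATION_CHANGE)

print(f"Surface freshening: about {surface_total:.0%} of the total accumulation")
print(f"Circulation changes: about {circulation_total:.0%} of the total accumulation")
```

Under these rounded inputs, surface freshening accounts for roughly 30% and circulation changes for roughly 20% of the total accumulation, which together make up the half attributed to sea ice decline.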